Distribution-based Clustering

Distribution-based methods group data points together based on their likelihood of belonging to the same probability distribution.

Examples

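A Gaussian Mixture Model (GMM) assumes that the data are generated from a mixture of several Gaussian distributions with unknown parameters, with each data point drawn from one of them.
It fits these parameters by alternating between two Expectation-Maximization (EM) steps until the fit stops improving: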
- In the Expectation (E) step: using the current Gaussian parameters (randomly initialized before the first iteration), each data point is given a probability of belonging to each cluster
- In the Maximization (M) step: the mean, covariance, and mixing weight of each Gaussian are updated to best fit the data points, weighted by those probabilities
GMM performs soft clustering: each example can belong to several clusters, with a different membership score (probability) for each.
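
The E and M steps above can be sketched in a few lines of code. This is a minimal illustration, not a reference implementation: it assumes one-dimensional data, a fixed number of iterations instead of a convergence check, and a simple initialization (random data points as starting means). The names fit_gmm_1d and gaussian_pdf are made up for this example.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k=2, n_iter=100, seed=0):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative only)."""
    rng = random.Random(seed)
    n = len(data)

    # Initialization (an assumption of this sketch): k random data points as
    # means, the overall data variance for every component, equal weights.
    means = rng.sample(data, k)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k

    resp = []
    for _ in range(n_iter):
        # E step: probability that each component generated each point,
        # computed from the current parameters.
        resp = []
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])

        # M step: re-estimate each component from the points, weighted by
        # the membership probabilities computed in the E step.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps the variance positive

    # resp holds the membership probabilities from the final E step.
    return weights, means, variances, resp


# Toy data: two well-separated groups centered near 0 and 10.
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
data += [data_rng.gauss(10.0, 1.0) for _ in range(200)]

weights, means, variances, resp = fit_gmm_1d(data, k=2)
print("weights:", weights)
print("means:", means)
print("membership scores of first point:", resp[0])
```

Each row of resp is the soft assignment described above: one membership score per cluster, summing to 1. For real work a library implementation is usually the better choice, since it handles multivariate data, convergence checks, and numerical stability.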